Stylistic multiple features mining based on attention network
WU Haiyan, LIU Ying
Journal of Computer Applications
2020, 40 (8): 2171-2181.
DOI: 10.11772/j.issn.1001-9081.2019122204
To address the difficulty of mining the features of different registers in large-scale corpora, a task that requires considerable expert knowledge and manual effort, a method for automatically mining the features that distinguish registers was proposed. First, each text was represented by words, parts of speech, punctuation marks, and their bigrams, together with syntactic structures and multiple combined features. Then, a model combining an attention mechanism with a Multi-Layer Perceptron (MLP), i.e. an attention network, was used to classify texts into three registers: novel, news, and textbook. The important features that help distinguish the registers were extracted automatically during this process. Finally, further analysis of these features yielded the characteristics of the different registers and some linguistic conclusions. Experimental results show that novels, news, and textbooks differ significantly in words, topic words, word dependencies, parts of speech, punctuation, and syntactic structures, which implies that the use of words, parts of speech, punctuation, and syntactic structures naturally varies with the audience, purpose, content, and setting of communication.
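The pipeline described in the abstract (features, then attention, then MLP, then a three-way register decision, with attention weights read back as feature importance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the additive attention form, the layer sizes, the feature names, and the helper names `init_params`, `attention_forward`, and `top_features` are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

REGISTERS = ["novel", "news", "textbook"]

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def init_params(vocab_size, dim=16, hidden=8, n_classes=3, seed=0):
    """Random (untrained) parameters; all sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "embed": rng.normal(0.0, 0.1, (vocab_size, dim)),  # one vector per feature
        "W_att": rng.normal(0.0, 0.1, (dim, dim)),         # attention projection
        "v_att": rng.normal(0.0, 0.1, dim),                # attention scoring vector
        "W1": rng.normal(0.0, 0.1, (dim, hidden)),         # MLP hidden layer
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, n_classes)),   # MLP output layer
        "b2": np.zeros(n_classes),
    }

def attention_forward(feature_ids, params):
    """Return (class probabilities, attention weight per input feature)."""
    E = params["embed"][feature_ids]                          # (n, dim)
    scores = np.tanh(E @ params["W_att"]) @ params["v_att"]   # (n,)
    alpha = softmax(scores)                                   # weights sum to 1
    context = alpha @ E                                       # weighted sum, (dim,)
    hidden = np.maximum(0.0, context @ params["W1"] + params["b1"])  # ReLU
    probs = softmax(hidden @ params["W2"] + params["b2"])
    return probs, alpha

def top_features(feature_names, alpha, k=3):
    """Rank the input features by attention weight, highest first."""
    order = np.argsort(alpha)[::-1][:k]
    return [(feature_names[i], float(alpha[i])) for i in order]

# Hypothetical feature inventory mixing words, POS tags, punctuation and bigrams.
vocab = ["word:said", "word:according", "pos:NN", "pos:VV",
         "punct:!", "punct:;", "bigram:PN_VV", "bigram:NN_NN"]
params = init_params(len(vocab))

# One text, represented as the ids of the features it contains.
text_ids = np.array([0, 3, 4, 6])
probs, alpha = attention_forward(text_ids, params)
ranked = top_features([vocab[i] for i in text_ids], alpha, k=2)
predicted = REGISTERS[int(np.argmax(probs))]
```

After training, averaging the attention weights of each feature over all texts of a register would give one plausible ranking of the features that characterize that register; with the random weights used here the ranking carries no linguistic meaning.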